{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Tensorflow_2.0_Tutorial_data_loading.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gsUWFq1o44lW",
        "colab_type": "text"
      },
      "source": [
        "## TF 2.0 initialization"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hSmIWQBYIEAv",
        "colab_type": "text"
      },
      "source": [
        "Initialize TensorFlow 2.0 and import the libraries used in this notebook."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wCPuNXGF2W6H",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "15d1e980-a5f5-4e20-fd22-ca0c2400589b"
      },
      "source": [
        "try:\n",
        "  # %tensorflow_version only exists in Colab.\n",
        "  %tensorflow_version 2.x\n",
        "except Exception:\n",
        "  pass"
      ],
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "TensorFlow 2.x selected.\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "fAhhJHc44FX8",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "from tensorflow import keras"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "THpXAZNwIqqK",
        "colab_type": "text"
      },
      "source": [
        "## Datasets preview\n",
        "\n",
        "Keras provides seven classic datasets:\n",
        "[Datasets](https://keras.io/datasets/#imdb-movie-reviews-sentiment-classification)\n",
        "* CIFAR10 small image classification\n",
        "* CIFAR100 small image classification\n",
        "* IMDB Movie reviews sentiment classification\n",
        "* Reuters newswire topics classification\n",
        "* MNIST database of handwritten digits\n",
        "* Fashion-MNIST database of fashion articles\n",
        "* Boston housing price regression dataset\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "A1EHNKR56hry",
        "colab_type": "text"
      },
      "source": [
        "MNIST"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pjY7nvI34Mpo",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 71
        },
        "outputId": "bd54065a-5ed5-4557-e3e2-0a9eadf41689"
      },
      "source": [
        "(x, y),(x_test, y_test) = keras.datasets.mnist.load_data()"
      ],
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n",
            "11493376/11490434 [==============================] - 0s 0us/step\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "b3VD2lpDI_Qc",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "d99e0b68-79c3-4c8e-8390-b60ce16536dd"
      },
      "source": [
        "type(x)"
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "numpy.ndarray"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 4
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wRluf-_040r4",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "1fc7de72-bb3b-4003-9f92-1b532edadef7"
      },
      "source": [
        "x.shape, y.shape, x_test.shape, y_test.shape"
      ],
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 5
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "anqY9pRk5BzD",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "aee39884-3a03-4292-9cf5-1fa4b3c028ab"
      },
      "source": [
        "x.min(), x.max(),x.mean()"
      ],
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(0, 255, 33.318421449829934)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 6
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "E8XuTRbu5N1o",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "136eedc7-290e-46a5-db7e-5fc2efec40a8"
      },
      "source": [
        "y[:5]"
      ],
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([5, 0, 4, 1, 9], dtype=uint8)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 7
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "iA6S-zYz5aSu",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "y_onehot = tf.one_hot(y,10)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IZZuc9sX53Ol",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 119
        },
        "outputId": "e5cf2151-f091-4ba0-9bdd-e58679837a58"
      },
      "source": [
        "y_onehot[:5]"
      ],
      "execution_count": 9,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: id=8, shape=(5, 10), dtype=float32, numpy=\n",
              "array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],\n",
              "       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],\n",
              "       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 9
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yZT3clMy6mrY",
        "colab_type": "text"
      },
      "source": [
        "CIFAR 10/100"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "T3iGb2bk56lE",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "45a7a1ce-2a30-4ada-d045-9922efbeb33e"
      },
      "source": [
        "(x, y),(x_test, y_test) = keras.datasets.cifar10.load_data()"
      ],
      "execution_count": 10,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz\n",
            "170500096/170498071 [==============================] - 11s 0us/step\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "U9iW8d6662TT",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "4cb9929c-78e0-442a-f5e9-d206b3e656da"
      },
      "source": [
        "x.shape, y.shape, x_test.shape, y_test.shape"
      ],
      "execution_count": 11,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "((50000, 32, 32, 3), (50000, 1), (10000, 32, 32, 3), (10000, 1))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 11
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tA9G0BEd66Fq",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "0c7b6779-015d-4185-e16d-4510d9475260"
      },
      "source": [
        "x.min(), x.max(),x.mean()"
      ],
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(0, 255, 120.70756512369792)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 12
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fwNXG4_AJoVH",
        "colab_type": "text"
      },
      "source": [
        "### TensorFlow Dataset\n",
        "\n",
        "Convert the NumPy arrays to a TensorFlow dataset."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "BhhHF8U268LM",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "bTlnnHyhJwZv",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "d75410ea-8f77-4e2a-fb4e-25d7708c7bf8"
      },
      "source": [
        "type(ds_test)"
      ],
      "execution_count": 71,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "tensorflow.python.data.ops.dataset_ops.TensorSliceDataset"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 71
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b7lfYnVJ71Nj",
        "colab_type": "text"
      },
      "source": [
        "`TensorSliceDataset` supports iteration, which enables preprocessing and multithreaded processing."
      ]
    },
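    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "A minimal sketch of iteration and parallel preprocessing, using small random arrays as a stand-in for the CIFAR-10 test set. The names `images`, `labels`, `scale`, and `ds_demo` are illustrative assumptions, not part of this notebook. It shows that a dataset built with `from_tensor_slices` can be iterated directly, and that `map` accepts `num_parallel_calls` so the function can run on several elements concurrently.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "\n",
        "# Synthetic stand-in data: 8 RGB images of 32x32 pixels with integer labels.\n",
        "images = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)\n",
        "labels = np.arange(8, dtype=np.int64)\n",
        "\n",
        "def scale(image, label):\n",
        "  # Convert to float32 and scale pixel values to [0, 1].\n",
        "  return tf.cast(image, tf.float32) / 255, label\n",
        "\n",
        "ds_demo = tf.data.Dataset.from_tensor_slices((images, labels))\n",
        "# num_parallel_calls lets tf.data apply `scale` to several elements concurrently.\n",
        "ds_demo = ds_demo.map(scale, num_parallel_calls=2)\n",
        "\n",
        "# Iterate over the first two elements.\n",
        "for image, label in ds_demo.take(2):\n",
        "  print(image.shape, image.dtype, int(label))\n",
        "```\n",
        "\n",
        "The value `2` is arbitrary here; a suitable degree of parallelism depends on the machine and the cost of the mapped function."
      ]
    },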
    {
      "cell_type": "code",
      "metadata": {
        "id": "hupY1uZD7XJu",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Equivalent two-step form:\n",
        "# it = iter(ds_test)\n",
        "# res = next(it)\n",
        "res = next(iter(ds_test))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "i16oRdkOVS6x",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "45e95dea-401c-48bf-e3f9-870ab71e4ff9"
      },
      "source": [
        "res[0].shape, res[1].shape"
      ],
      "execution_count": 73,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([32, 32, 3]), TensorShape([1]))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 73
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "c_Y_m6SdXH52",
        "colab_type": "text"
      },
      "source": [
        "### tfds\n",
        "\n",
        "TensorFlow Datasets (`tfds`) is an alternative, convenient data-loading tool provided by TensorFlow."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IjEOS63bJ0X9",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import tensorflow_datasets as tfds"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oFvGgb-OJ_WA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds2 = tfds.load(name=\"cifar10\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uaeP-8LTQJbA",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 71
        },
        "outputId": "007742db-9a68-4a85-baf4-b27f3a5d37db"
      },
      "source": [
        "ds2  # a dict mapping split names ('train', 'test') to datasets"
      ],
      "execution_count": 51,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'test': <_OptionsDataset shapes: {image: (32, 32, 3), label: ()}, types: {image: tf.uint8, label: tf.int64}>,\n",
              " 'train': <_OptionsDataset shapes: {image: (32, 32, 3), label: ()}, types: {image: tf.uint8, label: tf.int64}>}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 51
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "UTDPnuUxf7Gp",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "e93df059-cd24-41aa-90f7-4da8c61df496"
      },
      "source": [
        "len(list(tfds.as_numpy(ds2)['test']))"
      ],
      "execution_count": 81,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "10000"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 81
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "m-XQLiFtPS6f",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds2_train, ds2_test = ds2['train'], ds2['test']"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8-nwOcxkQRiw",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 51
        },
        "outputId": "9f702e53-6f20-4e4e-cf8b-85e8d19a11fe"
      },
      "source": [
        "type(ds2_train), type(ds2_test)"
      ],
      "execution_count": 56,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(tensorflow.python.data.ops.dataset_ops._OptionsDataset,\n",
              " tensorflow.python.data.ops.dataset_ops._OptionsDataset)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 56
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tgw2ov6XQjFT",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "92cc2afa-a2d7-4144-b045-44caa4bd58cd"
      },
      "source": [
        "next(iter(ds2_train))['image'].shape"
      ],
      "execution_count": 57,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "TensorShape([32, 32, 3])"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 57
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8Z0235UyU-x6",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "c09a88b5-a803-4b60-828c-0a0675ed25e2"
      },
      "source": [
        "type(next(iter(ds2_train))['image'])"
      ],
      "execution_count": 67,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "tensorflow.python.framework.ops.EagerTensor"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 67
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "eqg-rl6ciyDT",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "ebfa024c-d3ce-42a7-ba36-cd820f482307"
      },
      "source": [
        "len(list(ds2_train)), len(list(ds2_test))"
      ],
      "execution_count": 84,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(50000, 10000)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 84
        }
      ]
    },
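    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The `tfds` splits above yield dictionaries with `image` and `label` keys, whereas a function written as `preprocess(x, y)` expects a pair. Below is a minimal sketch of one way to bridge the two. It uses a small synthetic dataset with the same element structure so that nothing is downloaded; `ds_dict`, `to_pair`, and `ds_pairs` are illustrative names, not part of this notebook.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "\n",
        "# Synthetic dataset mimicking the tfds element structure {image, label}.\n",
        "ds_dict = tf.data.Dataset.from_tensor_slices({\n",
        "    'image': np.zeros((4, 32, 32, 3), dtype=np.uint8),\n",
        "    'label': np.array([3, 1, 4, 1], dtype=np.int64),\n",
        "})\n",
        "\n",
        "def to_pair(element):\n",
        "  # Unpack the dictionary into an (image, label) tuple.\n",
        "  return element['image'], element['label']\n",
        "\n",
        "ds_pairs = ds_dict.map(to_pair)\n",
        "\n",
        "image, label = next(iter(ds_pairs))\n",
        "print(image.shape, label.shape)\n",
        "```\n",
        "\n",
        "With the real data, passing `as_supervised=True` to `tfds.load` should return `(image, label)` tuples directly, which avoids the extra `map` step."
      ]
    },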
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "viLHV3q-kvqL",
        "colab_type": "text"
      },
      "source": [
        "### Preprocess"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TKw_BG8D9iSg",
        "colab_type": "text"
      },
      "source": [
        "Shuffle"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Gu5_QeCg80LS",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds2_train = ds2_train.shuffle(1000)  # shuffle using a 1000-element buffer"
      ],
      "execution_count": 0,
      "outputs": []
    },
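    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "`shuffle(1000)` keeps a buffer of 1000 elements and draws each output at random from that buffer, so the result is only a full shuffle when the buffer is at least as large as the dataset. The pure-Python sketch below illustrates the idea under simplifying assumptions. `buffered_shuffle` is a hypothetical helper, not a TensorFlow function, and it does not claim to match how `tf.data` is implemented internally.\n",
        "\n",
        "```python\n",
        "import random\n",
        "\n",
        "def buffered_shuffle(items, buffer_size, seed=None):\n",
        "  # Hypothetical helper sketching buffer-based shuffling.\n",
        "  rng = random.Random(seed)\n",
        "  buffer = []\n",
        "  for item in items:\n",
        "    if len(buffer) == buffer_size:\n",
        "      # Buffer is full: emit a random element and reuse its slot.\n",
        "      idx = rng.randrange(buffer_size)\n",
        "      yield buffer[idx]\n",
        "      buffer[idx] = item\n",
        "    else:\n",
        "      buffer.append(item)  # still filling the buffer\n",
        "  # Input exhausted: emit what is left in random order.\n",
        "  rng.shuffle(buffer)\n",
        "  yield from buffer\n",
        "\n",
        "# With a buffer of 3, early outputs can only come from early inputs.\n",
        "print(list(buffered_shuffle(range(10), buffer_size=3, seed=0)))\n",
        "# With a buffer of 1 the order is unchanged.\n",
        "print(list(buffered_shuffle(range(10), buffer_size=1)))\n",
        "```\n",
        "\n",
        "For CIFAR-10's 50,000 training images, a buffer of 1000 therefore mixes elements only locally; a larger buffer trades memory for a more thorough shuffle."
      ]
    },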
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CdW-IOSs-hBg",
        "colab_type": "text"
      },
      "source": [
        "Map a preprocessing function over every element of the dataset. `Dataset.map` is a convenient tool for data preprocessing."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_RMk3whM9Px6",
        "colab_type": "code",
        "colab": {}
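      },
      "source": [
        "def preprocess(x, y):\n",
        "  # Scale pixel values from [0, 255] to [0, 1].\n",
        "  x = tf.cast(x, tf.float32) / 255\n",
        "  # Drop the singleton label axis: shape (1,) -> ().\n",
        "  y = tf.squeeze(y, axis=0)\n",
        "  y = tf.cast(y, tf.int32)\n",
        "  # One-hot encode the 10 classes: shape () -> (10,).\n",
        "  y = tf.one_hot(y, depth=10)\n",
        "  return x, y"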
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "bNsOGV6q-irH",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds_test = ds_test.map(preprocess)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oBC0QXqT-sDz",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "res = next(iter(ds_test))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "FG9BIaTQ-8IF",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "1fa180db-413a-4391-b562-0f7a56a900b2"
      },
      "source": [
        "res[0].shape, res[1].shape"
      ],
      "execution_count": 92,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([32, 32, 3]), TensorShape([10]))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 92
        }
      ]
    },
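    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "To make the effect of `preprocess` concrete, here is a NumPy-only sketch of the same three steps applied to one hand-made sample. The names `pixel_values` and `label` and the 2x2 image size are assumptions chosen for illustration.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "# One tiny fake image and a CIFAR-style label of shape (1,).\n",
        "pixel_values = np.array([[0, 51], [102, 255]], dtype=np.uint8)\n",
        "label = np.array([3], dtype=np.uint8)\n",
        "\n",
        "# Step 1: cast to float32 and scale to [0, 1].\n",
        "x = pixel_values.astype(np.float32) / 255\n",
        "\n",
        "# Step 2: drop the singleton axis so the label becomes a scalar.\n",
        "y = np.squeeze(label, axis=0)\n",
        "\n",
        "# Step 3: one-hot encode by selecting row `y` of a 10x10 identity matrix.\n",
        "y_onehot = np.eye(10, dtype=np.float32)[int(y)]\n",
        "\n",
        "print(x)\n",
        "print(y_onehot)\n",
        "```\n",
        "\n",
        "The one-hot vector has a single 1 at index 3 and zeros elsewhere, matching the `(10,)` label shape shown above."
      ]
    },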
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "excinx1w_Qk5",
        "colab_type": "text"
      },
      "source": [
        "Batch"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Z8jTKxPR--oZ",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds_test_b = ds_test.batch(2000)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "LwTX365x_UxA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "res = next(iter(ds_test_b))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "e6bCDOBM_f-d",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "b7cfa1fe-6af6-4da9-b455-dc9f16d9fe65"
      },
      "source": [
        "res[0].shape, res[1].shape"
      ],
      "execution_count": 95,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([2000, 32, 32, 3]), TensorShape([2000, 10]))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 95
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IVBLJfnb_hda",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 102
        },
        "outputId": "1b449d07-4513-413f-e145-5e4860c36cb2"
      },
      "source": [
        "# i = 0\n",
        "# for x, y in ds_test_b:\n",
        "#   print(i, x.shape, y.shape)\n",
        "#   i += 1\n",
        "\n",
        "for i, (x, y) in enumerate(ds_test_b):\n",
        "  print(i, x.shape, y.shape)"
      ],
      "execution_count": 99,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "0 (2000, 32, 32, 3) (2000, 10)\n",
            "1 (2000, 32, 32, 3) (2000, 10)\n",
            "2 (2000, 32, 32, 3) (2000, 10)\n",
            "3 (2000, 32, 32, 3) (2000, 10)\n",
            "4 (2000, 32, 32, 3) (2000, 10)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "axWziMwcBVfK",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds_test_r = ds_test_b.repeat(2)  # repeat the batched dataset for two passes"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "EsC7Pkg7DZrE",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 102
        },
        "outputId": "4638752b-cc27-42fe-9d24-3125683f96d0"
      },
      "source": [
        "for i, (x, y) in enumerate(ds_test_b):\n",
        "  print(i, x.shape, y.shape)"
      ],
      "execution_count": 100,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "0 (2000, 32, 32, 3) (2000, 10)\n",
            "1 (2000, 32, 32, 3) (2000, 10)\n",
            "2 (2000, 32, 32, 3) (2000, 10)\n",
            "3 (2000, 32, 32, 3) (2000, 10)\n",
            "4 (2000, 32, 32, 3) (2000, 10)\n"
          ],
          "name": "stdout"
        }
      ]
    },
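    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The five batches printed above follow from simple arithmetic: 10,000 test images split into batches of 2,000. Below is a small sketch of that calculation. `num_batches` is a hypothetical helper; its `drop_remainder` and `repeat_count` arguments mirror the corresponding arguments of `Dataset.batch` and `Dataset.repeat`, assuming the dataset is batched first and repeated afterwards.\n",
        "\n",
        "```python\n",
        "import math\n",
        "\n",
        "def num_batches(num_examples, batch_size, drop_remainder=False, repeat_count=1):\n",
        "  # Batches in one pass: the last partial batch is kept unless dropped.\n",
        "  if drop_remainder:\n",
        "    per_pass = num_examples // batch_size\n",
        "  else:\n",
        "    per_pass = math.ceil(num_examples / batch_size)\n",
        "  # Repeating after batching replays the same batches repeat_count times.\n",
        "  return per_pass * repeat_count\n",
        "\n",
        "print(num_batches(10000, 2000))                  # 5\n",
        "print(num_batches(10000, 2000, repeat_count=2))  # 10\n",
        "print(num_batches(10000, 100))                   # 100\n",
        "```\n",
        "\n",
        "Iterating over a dataset repeated twice would therefore be expected to yield 10 batches rather than 5."
      ]
    },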
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7RmIRytTDkIH",
        "colab_type": "text"
      },
      "source": [
        "## Full data loading example\n",
        "\n",
        "Fashion-MNIST"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ALxGB5gGDikA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def mnist_preprocess(x, y):\n",
        "  # Scale pixel values from [0, 255] to [0, 1].\n",
        "  x = tf.cast(x, tf.float32) / 255\n",
        "  # No tf.squeeze needed: each Fashion-MNIST label is already a scalar.\n",
        "  y = tf.cast(y, tf.int64)\n",
        "  y = tf.one_hot(y, depth=10)\n",
        "  return x, y"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "6aPZUcE2Db1A",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 173
        },
        "outputId": "da2b324a-ddeb-45b4-c2ca-fecb2ba300a8"
      },
      "source": [
        "(x, y), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()"
      ],
      "execution_count": 102,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz\n",
            "32768/29515 [=================================] - 0s 0us/step\n",
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz\n",
            "26427392/26421880 [==============================] - 0s 0us/step\n",
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz\n",
            "8192/5148 [===============================================] - 0s 0us/step\n",
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz\n",
            "4423680/4422102 [==============================] - 0s 0us/step\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "SeuZqrxRE3Vv",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ds = tf.data.Dataset.from_tensor_slices((x,y))\n",
        "ds = ds.map(mnist_preprocess)\n",
        "ds = ds.shuffle(1000).batch(100)\n",
        "\n",
        "ds_test = tf.data.Dataset.from_tensor_slices((x_test,y_test))\n",
        "ds_test = ds_test.map(mnist_preprocess)\n",
        "ds_test = ds_test.shuffle(1000).batch(100)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9v1mHlt1FFgp",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "res = next(iter(ds))\n",
        "res_test = next(iter(ds_test))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "EqTEIkzLFTB3",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 85
        },
        "outputId": "b9c635ee-1a47-4308-f036-b5f0ddd2e5eb"
      },
      "source": [
        "res[0].shape, res[1].shape, res_test[0].shape, res_test[1].shape"
      ],
      "execution_count": 105,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([100, 28, 28]),\n",
              " TensorShape([100, 10]),\n",
              " TensorShape([100, 28, 28]),\n",
              " TensorShape([100, 10]))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 105
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ijXubuqKFeN_",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "outputId": "5c559237-0c20-4564-db8d-290756ddcf02"
      },
      "source": [
        "for step, (x,y) in enumerate(ds_test):\n",
        "  print(step, x.shape, y.shape)"
      ],
      "execution_count": 106,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "0 (100, 28, 28) (100, 10)\n",
            "1 (100, 28, 28) (100, 10)\n",
            "2 (100, 28, 28) (100, 10)\n",
            "3 (100, 28, 28) (100, 10)\n",
            "4 (100, 28, 28) (100, 10)\n",
            "5 (100, 28, 28) (100, 10)\n",
            "6 (100, 28, 28) (100, 10)\n",
            "7 (100, 28, 28) (100, 10)\n",
            "8 (100, 28, 28) (100, 10)\n",
            "9 (100, 28, 28) (100, 10)\n",
            "10 (100, 28, 28) (100, 10)\n",
            "11 (100, 28, 28) (100, 10)\n",
            "12 (100, 28, 28) (100, 10)\n",
            "13 (100, 28, 28) (100, 10)\n",
            "14 (100, 28, 28) (100, 10)\n",
            "15 (100, 28, 28) (100, 10)\n",
            "16 (100, 28, 28) (100, 10)\n",
            "17 (100, 28, 28) (100, 10)\n",
            "18 (100, 28, 28) (100, 10)\n",
            "19 (100, 28, 28) (100, 10)\n",
            "20 (100, 28, 28) (100, 10)\n",
            "21 (100, 28, 28) (100, 10)\n",
            "22 (100, 28, 28) (100, 10)\n",
            "23 (100, 28, 28) (100, 10)\n",
            "24 (100, 28, 28) (100, 10)\n",
            "25 (100, 28, 28) (100, 10)\n",
            "26 (100, 28, 28) (100, 10)\n",
            "27 (100, 28, 28) (100, 10)\n",
            "28 (100, 28, 28) (100, 10)\n",
            "29 (100, 28, 28) (100, 10)\n",
            "30 (100, 28, 28) (100, 10)\n",
            "31 (100, 28, 28) (100, 10)\n",
            "32 (100, 28, 28) (100, 10)\n",
            "33 (100, 28, 28) (100, 10)\n",
            "34 (100, 28, 28) (100, 10)\n",
            "35 (100, 28, 28) (100, 10)\n",
            "36 (100, 28, 28) (100, 10)\n",
            "37 (100, 28, 28) (100, 10)\n",
            "38 (100, 28, 28) (100, 10)\n",
            "39 (100, 28, 28) (100, 10)\n",
            "40 (100, 28, 28) (100, 10)\n",
            "41 (100, 28, 28) (100, 10)\n",
            "42 (100, 28, 28) (100, 10)\n",
            "43 (100, 28, 28) (100, 10)\n",
            "44 (100, 28, 28) (100, 10)\n",
            "45 (100, 28, 28) (100, 10)\n",
            "46 (100, 28, 28) (100, 10)\n",
            "47 (100, 28, 28) (100, 10)\n",
            "48 (100, 28, 28) (100, 10)\n",
            "49 (100, 28, 28) (100, 10)\n",
            "50 (100, 28, 28) (100, 10)\n",
            "51 (100, 28, 28) (100, 10)\n",
            "52 (100, 28, 28) (100, 10)\n",
            "53 (100, 28, 28) (100, 10)\n",
            "54 (100, 28, 28) (100, 10)\n",
            "55 (100, 28, 28) (100, 10)\n",
            "56 (100, 28, 28) (100, 10)\n",
            "57 (100, 28, 28) (100, 10)\n",
            "58 (100, 28, 28) (100, 10)\n",
            "59 (100, 28, 28) (100, 10)\n",
            "60 (100, 28, 28) (100, 10)\n",
            "61 (100, 28, 28) (100, 10)\n",
            "62 (100, 28, 28) (100, 10)\n",
            "63 (100, 28, 28) (100, 10)\n",
            "64 (100, 28, 28) (100, 10)\n",
            "65 (100, 28, 28) (100, 10)\n",
            "66 (100, 28, 28) (100, 10)\n",
            "67 (100, 28, 28) (100, 10)\n",
            "68 (100, 28, 28) (100, 10)\n",
            "69 (100, 28, 28) (100, 10)\n",
            "70 (100, 28, 28) (100, 10)\n",
            "71 (100, 28, 28) (100, 10)\n",
            "72 (100, 28, 28) (100, 10)\n",
            "73 (100, 28, 28) (100, 10)\n",
            "74 (100, 28, 28) (100, 10)\n",
            "75 (100, 28, 28) (100, 10)\n",
            "76 (100, 28, 28) (100, 10)\n",
            "77 (100, 28, 28) (100, 10)\n",
            "78 (100, 28, 28) (100, 10)\n",
            "79 (100, 28, 28) (100, 10)\n",
            "80 (100, 28, 28) (100, 10)\n",
            "81 (100, 28, 28) (100, 10)\n",
            "82 (100, 28, 28) (100, 10)\n",
            "83 (100, 28, 28) (100, 10)\n",
            "84 (100, 28, 28) (100, 10)\n",
            "85 (100, 28, 28) (100, 10)\n",
            "86 (100, 28, 28) (100, 10)\n",
            "87 (100, 28, 28) (100, 10)\n",
            "88 (100, 28, 28) (100, 10)\n",
            "89 (100, 28, 28) (100, 10)\n",
            "90 (100, 28, 28) (100, 10)\n",
            "91 (100, 28, 28) (100, 10)\n",
            "92 (100, 28, 28) (100, 10)\n",
            "93 (100, 28, 28) (100, 10)\n",
            "94 (100, 28, 28) (100, 10)\n",
            "95 (100, 28, 28) (100, 10)\n",
            "96 (100, 28, 28) (100, 10)\n",
            "97 (100, 28, 28) (100, 10)\n",
            "98 (100, 28, 28) (100, 10)\n",
            "99 (100, 28, 28) (100, 10)\n"
          ],
          "name": "stdout"
        }
      ]
    }
  ]
}